Universal Access To All Knowledge


Poster: Face_ Date: Sep 29, 2008 11:41am
Forum: web Subject: Re: Help Needed Quick

Unfortunately, the Wayback Machine is either unable or unwilling to archive YouTube videos (I suspect unwilling, because that would be a LOT of data). But even if it did, it wouldn't store YouTube user pages: YouTube itself does not permit this and, to my knowledge, never has. See: www.youtube.com/robots.txt. Are you sure that they allowed it for some time?

Another future TODO of medium prio: a few checks when submitting URLs through "Archive That", like checking whether there's actually something at the given location, and whether archiving is permitted. Currently, it doesn't check anything.
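As a rough idea of what those two checks could look like, here is a minimal sketch in Python using only the standard library. It is not how "Archive That" works; the function names `url_exists` and `archiving_permitted` and the user agent string `ExampleArchiver` are made up for illustration, and the robots.txt content is passed in as text so the permission check can be tried without touching the network.

```python
import urllib.request
from urllib import robotparser


def url_exists(url, timeout=10):
    """Return True if a HEAD request to `url` gets a non-error response.

    Hypothetical helper: any failure (bad URL, network error, HTTP error
    status) is treated as "nothing there".
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # strings that are not valid URLs at all.
        return False


def archiving_permitted(url, robots_txt, user_agent="ExampleArchiver"):
    """Return True if `robots_txt` (as text) lets `user_agent` fetch `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A submission form could call `url_exists` first and, if that passes, fetch the site's robots.txt and hand its text to `archiving_permitted`, rejecting the URL with a clear message when either check fails.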


Poster: Classic_TV_and_Radio_Fan Date: Sep 29, 2008 12:21pm
Forum: web Subject: Re: Help Needed Quick

But WebCite archives it fine, and it doesn't allow robots.txt-blocked pages to be archived.


Poster: Face_ Date: Sep 30, 2008 5:43am
Forum: web Subject: Re: Help Needed Quick

The WebCite bot doesn't care about robots.txt files. A bot is not obliged to follow them, after all; it does so only when it's programmed to.

I found out what might be the source of your confusion. YouTube's robots.txt disallows bots from accessing "www.youtube.com/user/{username}", but it allows them to access "www.youtube.com/{username}". In other words, this is allowed:


But this is not:


It's something of a 'security flaw'. Not that it makes much difference for you, though. Neither the Wayback Machine nor WebCite actually stores YouTube videos, so using them to archive YouTube is pointless.
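To illustrate the path difference described above, here is a small sketch with Python's standard robots.txt parser. The rules in `SAMPLE_ROBOTS_TXT` are an assumption that mimics only the one rule discussed here, not a copy of YouTube's actual file, and `someuser` and `ExampleBot` are placeholder names.

```python
from urllib import robotparser

# Assumed rule set for illustration only: block /user/..., nothing else.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /user/
"""


def is_allowed(path, robots_txt=SAMPLE_ROBOTS_TXT, user_agent="ExampleBot"):
    """Return True if a bot honouring `robots_txt` may fetch `path`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "http://www.youtube.com" + path)


print(is_allowed("/someuser"))       # short form of the user page
print(is_allowed("/user/someuser"))  # long form under /user/
```

Rules of this kind match on the start of the path, so a rule for `/user/` says nothing about the same page reached through a different path.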